Dataset statistics
| Number of variables | 12 |
|---|---|
| Number of observations | 678013 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 44.0 MiB |
| Average record size in memory | 68.0 B |
Variable types
| Numeric | 8 |
|---|---|
| Categorical | 4 |
DrivAge is highly correlated with BonusMalus | High correlation |
BonusMalus is highly correlated with DrivAge | High correlation |
Area is highly correlated with Density and 1 other fields | High correlation |
Density is highly correlated with Area and 1 other fields | High correlation |
Region is highly correlated with Area and 1 other fields | High correlation |
IDpol has unique values | Unique |
ClaimNb has 643953 (95.0%) zeros | Zeros |
VehAge has 57739 (8.5%) zeros | Zeros |
Reproduction
| Analysis started | 2022-03-02 08:16:41.898343 |
|---|---|
| Analysis finished | 2022-03-02 08:17:13.176074 |
| Duration | 31.28 seconds |
| Software version | pandas-profiling v3.1.0 |
| Download configuration | config.json |
IDpol
Real number (ℝ≥0)
| Distinct | 678013 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2621856.921 |
| Minimum | 1 |
|---|---|
| Maximum | 6114330 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.2 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 69365.6 |
| Q1 | 1157951 |
| median | 2272152 |
| Q3 | 4046274 |
| 95-th percentile | 6014195.2 |
| Maximum | 6114330 |
| Range | 6114329 |
| Interquartile range (IQR) | 2888323 |
Descriptive statistics
| Standard deviation | 1641782.753 |
|---|---|
| Coefficient of variation (CV) | 0.6261908266 |
| Kurtosis | -0.6583474996 |
| Mean | 2621856.921 |
| Median Absolute Deviation (MAD) | 1152062 |
| Skewness | 0.2378901399 |
| Sum | 1.777653077 × 10¹² |
| Variance | 2.695450607 × 10¹² |
| Monotonicity | Strictly increasing |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 2126260 | 1 | < 0.1% |
| 64613 | 1 | < 0.1% |
| 1100330 | 1 | < 0.1% |
| 3251083 | 1 | < 0.1% |
| 52131 | 1 | < 0.1% |
| 3160184 | 1 | < 0.1% |
| 3149256 | 1 | < 0.1% |
| 2017578 | 1 | < 0.1% |
| 3017221 | 1 | < 0.1% |
| 3240792 | 1 | < 0.1% |
| Other values (678003) | 678003 | > 99.9% |
| Value | Count | Frequency (%) |
| 1 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 10 | 1 | < 0.1% |
| 11 | 1 | < 0.1% |
| 13 | 1 | < 0.1% |
| 15 | 1 | < 0.1% |
| 17 | 1 | < 0.1% |
| 18 | 1 | < 0.1% |
| 21 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 6114330 | 1 | < 0.1% |
| 6114329 | 1 | < 0.1% |
| 6114328 | 1 | < 0.1% |
| 6114327 | 1 | < 0.1% |
| 6114326 | 1 | < 0.1% |
| 6114325 | 1 | < 0.1% |
| 6114324 | 1 | < 0.1% |
| 6114323 | 1 | < 0.1% |
| 6114322 | 1 | < 0.1% |
| 6114321 | 1 | < 0.1% |
ClaimNb
Real number (ℝ≥0)
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.05324676665 |
| Minimum | 0 |
|---|---|
| Maximum | 16 |
| Zeros | 643953 |
| Zeros (%) | 95.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 1 |
| Maximum | 16 |
| Range | 16 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.2401173304 |
|---|---|
| Coefficient of variation (CV) | 4.509519461 |
| Kurtosis | 76.84187999 |
| Mean | 0.05324676665 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 5.599613312 |
| Sum | 36102 |
| Variance | 0.05765633238 |
| Monotonicity | Not monotonic |
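The descriptive statistics reported for this count variable are tied together by simple identities: the mean is the sum divided by the number of observations, the variance is the squared standard deviation, and the coefficient of variation is the standard deviation divided by the mean. The short Python sketch below re-derives them from the figures shown in the tables of this report. The variable names are illustrative only, and the inputs are the rounded values as displayed, so agreement is expected only up to rounding.

```python
# Figures copied from the report tables (rounded as displayed there).
n_observations = 678013   # "Number of observations"
total = 36102             # "Sum"
mean = 0.05324676665      # "Mean"
std = 0.2401173304        # "Standard deviation"

# Identities linking the reported statistics.
derived_mean = total / n_observations   # mean = sum / n
derived_variance = std ** 2             # variance = std^2
derived_cv = std / mean                 # CV = std / mean

print(f"mean     ~ {derived_mean:.10f}")
print(f"variance ~ {derived_variance:.10f}")
print(f"CV       ~ {derived_cv:.6f}")
```

The derived values can be compared by eye with the Mean, Variance and Coefficient of variation rows above.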
Histogram with fixed size bins (bins=11)
| Value | Count | Frequency (%) |
| 0 | 643953 | 95.0% |
| 1 | 32178 | 4.7% |
| 2 | 1784 | 0.3% |
| 3 | 82 | < 0.1% |
| 4 | 7 | < 0.1% |
| 11 | 3 | < 0.1% |
| 5 | 2 | < 0.1% |
| 6 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 0 | 643953 | 95.0% |
| 1 | 32178 | 4.7% |
| 2 | 1784 | 0.3% |
| 3 | 82 | < 0.1% |
| 4 | 7 | < 0.1% |
| 5 | 2 | < 0.1% |
| 6 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| 11 | 3 | < 0.1% |
| Value | Count | Frequency (%) |
| 16 | 1 | < 0.1% |
| 11 | 3 | < 0.1% |
| 9 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 5 | 2 | < 0.1% |
| 4 | 7 | < 0.1% |
| 3 | 82 | < 0.1% |
| 2 | 1784 | 0.3% |
| 1 | 32178 | 4.7% |
Exposure
Real number (ℝ≥0)
| Distinct | 187 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.5287501058 |
| Minimum | 0.00273224 |
|---|---|
| Maximum | 2.01 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.2 MiB |
Quantile statistics
| Minimum | 0.00273224 |
|---|---|
| 5-th percentile | 0.04 |
| Q1 | 0.18 |
| median | 0.49 |
| Q3 | 0.99 |
| 95-th percentile | 1 |
| Maximum | 2.01 |
| Range | 2.00726776 |
| Interquartile range (IQR) | 0.81 |
Descriptive statistics
| Standard deviation | 0.3644415463 |
|---|---|
| Coefficient of variation (CV) | 0.6892510136 |
| Kurtosis | -1.524243691 |
| Mean | 0.5287501058 |
| Median Absolute Deviation (MAD) | 0.37 |
| Skewness | 0.08531780026 |
| Sum | 358499.4455 |
| Variance | 0.1328176407 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1 | 168125 | 24.8% |
| 0.08 | 44670 | 6.6% |
| 0.07 | 12969 | 1.9% |
| 0.24 | 12950 | 1.9% |
| 0.5 | 12497 | 1.8% |
| 0.49 | 12298 | 1.8% |
| 0.03 | 11996 | 1.8% |
| 0.04 | 11131 | 1.6% |
| 0.12 | 11047 | 1.6% |
| 0.2 | 8727 | 1.3% |
| Other values (177) | 371603 | 54.8% |
| Value | Count | Frequency (%) |
| 0.00273224 | 295 | < 0.1% |
| 0.002732240437 | 765 | 0.1% |
| 0.002739726 | 312 | < 0.1% |
| 0.002739726027 | 1733 | 0.3% |
| 0.005464480874 | 464 | 0.1% |
| 0.005464481 | 145 | < 0.1% |
| 0.005479452 | 355 | 0.1% |
| 0.005479452055 | 1041 | 0.2% |
| 0.008196721 | 113 | < 0.1% |
| 0.008196721311 | 507 | 0.1% |
| Value | Count | Frequency (%) |
| 2.01 | 2 | < 0.1% |
| 2 | 1 | < 0.1% |
| 1.99 | 1 | < 0.1% |
| 1.98 | 1 | < 0.1% |
| 1.93 | 1 | < 0.1% |
| 1.92 | 1 | < 0.1% |
| 1.9 | 2 | < 0.1% |
| 1.88 | 1 | < 0.1% |
| 1.85 | 2 | < 0.1% |
| 1.82 | 1 | < 0.1% |
Area
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 662.5 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | D |
|---|---|
| 2nd row | D |
| 3rd row | B |
| 4th row | B |
| 5th row | B |
Common Values
| Value | Count | Frequency (%) |
| C | 191880 | 28.3% |
| D | 151596 | 22.4% |
| E | 137167 | 20.2% |
| A | 103957 | 15.3% |
| B | 75459 | 11.1% |
| F | 17954 | 2.6% |
VehPower
Real number (ℝ≥0)
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.454631401 |
| Minimum | 4 |
|---|---|
| Maximum | 15 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.2 MiB |
Quantile statistics
| Minimum | 4 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 5 |
| median | 6 |
| Q3 | 7 |
| 95-th percentile | 11 |
| Maximum | 15 |
| Range | 11 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 2.050905698 |
|---|---|
| Coefficient of variation (CV) | 0.3177417222 |
| Kurtosis | 1.668206924 |
| Mean | 6.454631401 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 1.17134444 |
| Sum | 4376324 |
| Variance | 4.206214181 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=12)
| Value | Count | Frequency (%) |
| 6 | 148976 | 22.0% |
| 7 | 145401 | 21.4% |
| 5 | 124821 | 18.4% |
| 4 | 115349 | 17.0% |
| 8 | 46956 | 6.9% |
| 10 | 31354 | 4.6% |
| 9 | 30085 | 4.4% |
| 11 | 18352 | 2.7% |
| 12 | 8214 | 1.2% |
| 13 | 3229 | 0.5% |
| Other values (2) | 5276 | 0.8% |
| Value | Count | Frequency (%) |
| 4 | 115349 | 17.0% |
| 5 | 124821 | 18.4% |
| 6 | 148976 | 22.0% |
| 7 | 145401 | 21.4% |
| 8 | 46956 | 6.9% |
| 9 | 30085 | 4.4% |
| 10 | 31354 | 4.6% |
| 11 | 18352 | 2.7% |
| 12 | 8214 | 1.2% |
| 13 | 3229 | 0.5% |
| Value | Count | Frequency (%) |
| 15 | 2926 | 0.4% |
| 14 | 2350 | 0.3% |
| 13 | 3229 | 0.5% |
| 12 | 8214 | 1.2% |
| 11 | 18352 | 2.7% |
| 10 | 31354 | 4.6% |
| 9 | 30085 | 4.4% |
| 8 | 46956 | 6.9% |
| 7 | 145401 | 21.4% |
| 6 | 148976 | 22.0% |
VehAge
Real number (ℝ≥0)
| Distinct | 78 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.044264638 |
| Minimum | 0 |
|---|---|
| Maximum | 100 |
| Zeros | 57739 |
| Zeros (%) | 8.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.2 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 6 |
| Q3 | 11 |
| 95-th percentile | 17 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 9 |
Descriptive statistics
| Standard deviation | 5.66623158 |
|---|---|
| Coefficient of variation (CV) | 0.804375172 |
| Kurtosis | 6.522053975 |
| Mean | 7.044264638 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | 1.147988998 |
| Sum | 4776103 |
| Variance | 32.10618032 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1 | 71284 | 10.5% |
| 2 | 59124 | 8.7% |
| 0 | 57739 | 8.5% |
| 3 | 50261 | 7.4% |
| 4 | 43492 | 6.4% |
| 5 | 38737 | 5.7% |
| 10 | 38395 | 5.7% |
| 6 | 35717 | 5.3% |
| 7 | 32880 | 4.8% |
| 8 | 32680 | 4.8% |
| Other values (68) | 217704 | 32.1% |
| Value | Count | Frequency (%) |
| 0 | 57739 | 8.5% |
| 1 | 71284 | 10.5% |
| 2 | 59124 | 8.7% |
| 3 | 50261 | 7.4% |
| 4 | 43492 | 6.4% |
| 5 | 38737 | 5.7% |
| 6 | 35717 | 5.3% |
| 7 | 32880 | 4.8% |
| 8 | 32680 | 4.8% |
| 9 | 31922 | 4.7% |
| Value | Count | Frequency (%) |
| 100 | 25 | < 0.1% |
| 99 | 23 | < 0.1% |
| 85 | 1 | < 0.1% |
| 84 | 1 | < 0.1% |
| 83 | 2 | < 0.1% |
| 82 | 1 | < 0.1% |
| 81 | 3 | < 0.1% |
| 80 | 3 | < 0.1% |
| 79 | 1 | < 0.1% |
| 78 | 1 | < 0.1% |
DrivAge
Real number (ℝ≥0)
| Distinct | 83 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 45.4991217 |
| Minimum | 18 |
|---|---|
| Maximum | 100 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.2 MiB |
Quantile statistics
| Minimum | 18 |
|---|---|
| 5-th percentile | 25 |
| Q1 | 34 |
| median | 44 |
| Q3 | 55 |
| 95-th percentile | 72 |
| Maximum | 100 |
| Range | 82 |
| Interquartile range (IQR) | 21 |
Descriptive statistics
| Standard deviation | 14.13744407 |
|---|---|
| Coefficient of variation (CV) | 0.3107190545 |
| Kurtosis | -0.3426884401 |
| Mean | 45.4991217 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | 0.4357585748 |
| Sum | 30848996 |
| Variance | 199.867325 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 36 | 17530 | 2.6% |
| 38 | 17346 | 2.6% |
| 39 | 17320 | 2.6% |
| 37 | 17295 | 2.6% |
| 52 | 17195 | 2.5% |
| 34 | 17059 | 2.5% |
| 40 | 17017 | 2.5% |
| 51 | 17016 | 2.5% |
| 41 | 16977 | 2.5% |
| 42 | 16953 | 2.5% |
| Other values (73) | 506305 | 74.7% |
| Value | Count | Frequency (%) |
| 18 | 748 | 0.1% |
| 19 | 2392 | 0.4% |
| 20 | 3676 | 0.5% |
| 21 | 4437 | 0.7% |
| 22 | 5291 | 0.8% |
| 23 | 6261 | 0.9% |
| 24 | 7393 | 1.1% |
| 25 | 8697 | 1.3% |
| 26 | 10301 | 1.5% |
| 27 | 11827 | 1.7% |
| Value | Count | Frequency (%) |
| 100 | 3 | < 0.1% |
| 99 | 70 | < 0.1% |
| 98 | 5 | < 0.1% |
| 97 | 10 | < 0.1% |
| 96 | 15 | < 0.1% |
| 95 | 24 | < 0.1% |
| 94 | 32 | < 0.1% |
| 93 | 55 | < 0.1% |
| 92 | 66 | < 0.1% |
| 91 | 121 | < 0.1% |
BonusMalus
Real number (ℝ≥0)
| Distinct | 115 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 59.76150162 |
| Minimum | 50 |
|---|---|
| Maximum | 230 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.2 MiB |
Quantile statistics
| Minimum | 50 |
|---|---|
| 5-th percentile | 50 |
| Q1 | 50 |
| median | 50 |
| Q3 | 64 |
| 95-th percentile | 95 |
| Maximum | 230 |
| Range | 180 |
| Interquartile range (IQR) | 14 |
Descriptive statistics
| Standard deviation | 15.63665766 |
|---|---|
| Coefficient of variation (CV) | 0.2616510167 |
| Kurtosis | 2.674811214 |
| Mean | 59.76150162 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.728934068 |
| Sum | 40519075 |
| Variance | 244.5050628 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 50 | 384156 | 56.7% |
| 100 | 19530 | 2.9% |
| 68 | 18791 | 2.8% |
| 72 | 18580 | 2.7% |
| 76 | 18226 | 2.7% |
| 64 | 18192 | 2.7% |
| 80 | 18086 | 2.7% |
| 57 | 17938 | 2.6% |
| 60 | 17363 | 2.6% |
| 54 | 17360 | 2.6% |
| Other values (105) | 129791 | 19.1% |
| Value | Count | Frequency (%) |
| 50 | 384156 | 56.7% |
| 51 | 15869 | 2.3% |
| 52 | 4770 | 0.7% |
| 53 | 3351 | 0.5% |
| 54 | 17360 | 2.6% |
| 55 | 5593 | 0.8% |
| 56 | 3453 | 0.5% |
| 57 | 17938 | 2.6% |
| 58 | 5970 | 0.9% |
| 59 | 2779 | 0.4% |
| Value | Count | Frequency (%) |
| 230 | 1 | < 0.1% |
| 228 | 1 | < 0.1% |
| 218 | 1 | < 0.1% |
| 208 | 1 | < 0.1% |
| 198 | 2 | < 0.1% |
| 196 | 3 | < 0.1% |
| 195 | 6 | < 0.1% |
| 190 | 3 | < 0.1% |
| 187 | 3 | < 0.1% |
| 185 | 5 | < 0.1% |
VehBrand
Categorical
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 662.6 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 2 |
| Mean length | 2.314951188 |
| Min length | 2 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | B12 |
|---|---|
| 2nd row | B12 |
| 3rd row | B12 |
| 4th row | B12 |
| 5th row | B12 |
Common Values
| Value | Count | Frequency (%) |
| B12 | 166024 | 24.5% |
| B1 | 162736 | 24.0% |
| B2 | 159861 | 23.6% |
| B3 | 53395 | 7.9% |
| B5 | 34753 | 5.1% |
| B6 | 28548 | 4.2% |
| B4 | 25179 | 3.7% |
| B10 | 17707 | 2.6% |
| B11 | 13585 | 2.0% |
| B13 | 12178 | 1.8% |
VehGas
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 662.4 KiB |
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 6.510133287 |
| Min length | 6 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Regular |
|---|---|
| 2nd row | Regular |
| 3rd row | Diesel |
| 4th row | Diesel |
| 5th row | Diesel |
Common Values
| Value | Count | Frequency (%) |
| Regular | 345877 | 51.0% |
| Diesel | 332136 | 49.0% |
Density
Real number (ℝ≥0)
| Distinct | 1607 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1792.422405 |
| Minimum | 1 |
|---|---|
| Maximum | 27000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.2 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 20 |
| Q1 | 92 |
| median | 393 |
| Q3 | 1658 |
| 95-th percentile | 7313 |
| Maximum | 27000 |
| Range | 26999 |
| Interquartile range (IQR) | 1566 |
Descriptive statistics
| Standard deviation | 3958.646564 |
|---|---|
| Coefficient of variation (CV) | 2.208545571 |
| Kurtosis | 24.86945063 |
| Mean | 1792.422405 |
| Median Absolute Deviation (MAD) | 355 |
| Skewness | 4.65142115 |
| Sum | 1215285692 |
| Variance | 15670882.62 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 27000 | 10515 | 1.6% |
| 3317 | 9891 | 1.5% |
| 1313 | 7157 | 1.1% |
| 9307 | 5986 | 0.9% |
| 3744 | 5540 | 0.8% |
| 1326 | 5447 | 0.8% |
| 405 | 5195 | 0.8% |
| 4128 | 5055 | 0.7% |
| 4762 | 4985 | 0.7% |
| 57 | 4262 | 0.6% |
| Other values (1597) | 613980 | 90.6% |
| Value | Count | Frequency (%) |
| 1 | 7 | < 0.1% |
| 2 | 92 | < 0.1% |
| 3 | 304 | < 0.1% |
| 4 | 274 | < 0.1% |
| 5 | 438 | 0.1% |
| 6 | 752 | 0.1% |
| 7 | 1088 | 0.2% |
| 8 | 1131 | 0.2% |
| 9 | 1813 | 0.3% |
| 10 | 2911 | 0.4% |
| Value | Count | Frequency (%) |
| 27000 | 10515 | 1.6% |
| 23396 | 66 | < 0.1% |
| 22821 | 182 | < 0.1% |
| 22669 | 463 | 0.1% |
| 21410 | 76 | < 0.1% |
| 20000 | 6 | < 0.1% |
| 18229 | 200 | < 0.1% |
| 17140 | 910 | 0.1% |
| 16533 | 613 | 0.1% |
| 16291 | 175 | < 0.1% |
Region
Categorical
| Distinct | 22 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 663.0 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | R82 |
|---|---|
| 2nd row | R82 |
| 3rd row | R22 |
| 4th row | R72 |
| 5th row | R72 |
Common Values
| Value | Count | Frequency (%) |
| R24 | 160601 | 23.7% |
| R82 | 84752 | 12.5% |
| R93 | 79315 | 11.7% |
| R11 | 69791 | 10.3% |
| R53 | 42122 | 6.2% |
| R52 | 38751 | 5.7% |
| R91 | 35805 | 5.3% |
| R72 | 31329 | 4.6% |
| R31 | 27285 | 4.0% |
| R54 | 19046 | 2.8% |
| Other values (12) | 89216 | 13.2% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
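The calculation described above can be sketched in plain Python: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. This is a minimal illustration, not the code the profiling tool runs. The helper names `average_ranks`, `pearson_r` and `spearman_rho` are hypothetical, and ties are given the average of their positions, which is one common convention.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of standard deviations (1/n factors cancel)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)

def spearman_rho(x, y):
    """Spearman's rho as Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship (toy data, not from the dataset).
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print("Spearman:", spearman_rho(x, y))
print("Pearson: ", pearson_r(x, y))
```

On this toy data the relationship is perfectly monotonic, so ρ is 1 up to floating-point error, while Pearson's r stays below 1 because the relationship is not linear.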
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
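As a rough illustration of the formula above, the following plain-Python sketch computes r and checks the invariance under a change of location and (positive) scale. The function name `pearson_r` and the toy data are assumptions made for this example; they are not taken from the report or from the profiling tool.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # The 1/n factors of covariance and standard deviations cancel out.
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return covariance / (spread_x * spread_y)

# Toy data for illustration only.
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]
r_original = pearson_r(x, y)

# Shift and rescale x with a positive factor: r should not change.
x_rescaled = [10 * value + 3 for value in x]
r_rescaled = pearson_r(x_rescaled, y)

print(r_original, r_rescaled)
```

Note that the invariance shown here uses a positive scale factor; multiplying one variable by a negative factor flips the sign of r.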
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
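The pair-counting definition above translates directly into a small Python sketch. It follows the description literally, with no tie correction and a quadratic loop over all pairs, so it is an illustration only; library implementations often apply a tie correction and may give different values when ties are present. The function name `kendall_tau` and the toy data are assumptions made for this example.

```python
from itertools import combinations

def kendall_tau(x, y):
    """(concordant - discordant) / total pairs, without any tie correction."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y: counted as neither
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs

# Toy data: six pairs in total, five concordant and one discordant.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau(x, y)
print(tau)
```

With five concordant pairs and one discordant pair out of six, τ here is (5 - 1) / 6.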
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.
First rows
| | IDpol | ClaimNb | Exposure | Area | VehPower | VehAge | DrivAge | BonusMalus | VehBrand | VehGas | Density | Region |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 1 | 0.10 | D | 5 | 0 | 55 | 50 | B12 | Regular | 1217 | R82 |
| 1 | 3.0 | 1 | 0.77 | D | 5 | 0 | 55 | 50 | B12 | Regular | 1217 | R82 |
| 2 | 5.0 | 1 | 0.75 | B | 6 | 2 | 52 | 50 | B12 | Diesel | 54 | R22 |
| 3 | 10.0 | 1 | 0.09 | B | 7 | 0 | 46 | 50 | B12 | Diesel | 76 | R72 |
| 4 | 11.0 | 1 | 0.84 | B | 7 | 0 | 46 | 50 | B12 | Diesel | 76 | R72 |
| 5 | 13.0 | 1 | 0.52 | E | 6 | 2 | 38 | 50 | B12 | Regular | 3003 | R31 |
| 6 | 15.0 | 1 | 0.45 | E | 6 | 2 | 38 | 50 | B12 | Regular | 3003 | R31 |
| 7 | 17.0 | 1 | 0.27 | C | 7 | 0 | 33 | 68 | B12 | Diesel | 137 | R91 |
| 8 | 18.0 | 1 | 0.71 | C | 7 | 0 | 33 | 68 | B12 | Diesel | 137 | R91 |
| 9 | 21.0 | 1 | 0.15 | B | 7 | 0 | 41 | 50 | B12 | Diesel | 60 | R52 |
Last rows
| | IDpol | ClaimNb | Exposure | Area | VehPower | VehAge | DrivAge | BonusMalus | VehBrand | VehGas | Density | Region |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 678003 | 6114321.0 | 0 | 0.005479 | E | 4 | 0 | 29 | 80 | B12 | Regular | 5360 | R11 |
| 678004 | 6114322.0 | 0 | 0.005479 | E | 11 | 0 | 49 | 74 | B12 | Diesel | 5360 | R11 |
| 678005 | 6114323.0 | 0 | 0.005479 | D | 4 | 0 | 34 | 80 | B12 | Regular | 731 | R82 |
| 678006 | 6114324.0 | 0 | 0.005479 | D | 11 | 0 | 41 | 50 | B12 | Diesel | 528 | R93 |
| 678007 | 6114325.0 | 0 | 0.005479 | E | 6 | 4 | 40 | 68 | B12 | Regular | 2733 | R93 |
| 678008 | 6114326.0 | 0 | 0.002740 | E | 4 | 0 | 54 | 50 | B12 | Regular | 3317 | R93 |
| 678009 | 6114327.0 | 0 | 0.002740 | E | 4 | 0 | 41 | 95 | B12 | Regular | 9850 | R11 |
| 678010 | 6114328.0 | 0 | 0.002740 | D | 6 | 2 | 45 | 50 | B12 | Diesel | 1323 | R82 |
| 678011 | 6114329.0 | 0 | 0.002740 | B | 4 | 0 | 60 | 50 | B12 | Regular | 95 | R26 |
| 678012 | 6114330.0 | 0 | 0.002740 | B | 7 | 6 | 29 | 54 | B12 | Diesel | 65 | R72 |